KS-algebra consists of expressions constructed with four kinds of
operations: the minimum, maximum, difference and additively homogeneous
generalized means. Five families of Z-classifiers are investigated on
binary classification tasks between English phonemes. It is shown that
the classifiers are able to reflect well-known formant characteristics
of vowels, while having very small Kolmogorov complexity.

1 Introduction

In our previous paper in the series we proposed a new KS-algebra for
constructing binary phoneme classifiers based on spectral content. The
algebra consists of expressions constructed from a vector of spectral values
$s = (s_1, \ldots, s_n)$ and the zero value by means of the following operators:
• the minimum $\min(x_1, \ldots, x_n)$,

• the maximum $\max(x_1, \ldots, x_n)$,

• the difference $x_1 - x_2$,

• the additively homogeneous means $A_a$, where

$$A_a(x_1, \ldots, x_n) = \ln\left[ M_a(\exp(x_1), \ldots, \exp(x_n)) \right]$$

and $M_a$ is the generalized mean

$$M_a(x_1, \ldots, x_n) = \left( \frac{x_1^a + \cdots + x_n^a}{n} \right)^{1/a}.$$



In this article we present the results of a search for an optimal
Z-classifier in a large, albeit special, family of elements of KS-algebra.
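
As an illustration, the two means can be sketched in R, the language of the
listing in Section 5. This is a minimal sketch and not the authors'
implementation: the names gen_mean and A and the sample vector are
hypothetical, and the limiting cases a = 0 and a = ±∞ are handled by the usual
conventions (geometric mean, minimum, maximum). The last lines check additive
homogeneity, $A_a(x + c) = A_a(x) + c$, numerically.

```r
# Generalized (power) mean M_a of positive numbers y.
# Limiting cases are handled explicitly: a = 0 is the geometric mean,
# a = -Inf the minimum and a = Inf the maximum.
gen_mean <- function(y, a) {
  if (a == 0) return(exp(mean(log(y))))
  if (a == -Inf) return(min(y))
  if (a == Inf) return(max(y))
  mean(y^a)^(1 / a)
}

# Additively homogeneous mean A_a(x) = ln M_a(exp(x)).
A <- function(x, a) log(gen_mean(exp(x), a))

x <- c(-1.2, 0.4, 2.5)   # hypothetical spectral values
shift <- 3               # constant added to every value

# TRUE for each a when A_a(x + shift) equals A_a(x) + shift.
homogeneous <- sapply(
  c(0, 1, 2, -Inf),
  function(a) isTRUE(all.equal(A(x + shift, a), A(x, a) + shift))
)
```

For a = 0 the function A reduces to the arithmetic mean of x, and for a = -Inf
to its minimum. For large spectral values exp(x) may overflow; a more robust
version would subtract max(x) before exponentiating.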

2 Optimization setup 

For the dataset we use the spectral data presented in [1] and used for
demonstration in [2] and [3]. The data is derived from the TIMIT database,
which is often used in speech recognition tasks. It consists of 5 English
phonemes, three vowels aa, ao, iy and two consonants dcl, sh, each pronounced
by male speakers from various geographical regions. The sound was sampled at
16 kHz, and the spectral data was prepared using a 512-sample window,
resulting in a spectral vector of 256 values for each sample. The data is
divided into train and test categories, with approximately equal proportions
in each.

Let us recall that a general Z-classifier for phonemes $\phi_1$, $\phi_2$
corresponds to an element $f$ of KS-algebra. Suppose the classifier is
presented with spectral data $s$ and the prior knowledge that the data
corresponds to either $\phi_1$ or $\phi_2$. It decides that the phoneme is
$\phi_1$ if $f(s) < 0$ and that the phoneme is $\phi_2$ if $f(s) > 0$.



For an optimization criterion we chose the number of successful
classifications c(f) on the training data set. Since the data set is rather
small, ties may occur. In the case of ties, we choose the classifier that
maximizes an expression p(f) formed from the sample means and standard
deviations of the values of $f$ on the sets of training samples of each
phoneme. The expression plays a role analogous to that of Fisher's linear
discriminant.

Since there is no obvious shortcut to finding an optimum, we resort to 
evaluating classification performance in turn for every classifier in a given 
family. 
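
The selection rule described above (training count first, tie-breaker second)
can be sketched in R as follows. The helper names and the toy values are
hypothetical. The tie-breaking expression of the paper is not reproduced in
this sketch; the classical Fisher ratio
$(\mu_1 - \mu_2)^2 / (\sigma_1^2 + \sigma_2^2)$ is used as an assumed stand-in.

```r
# v1, v2: values of a candidate f on the training samples of the first
# and the second phoneme (hypothetical inputs).

# Number of correct decisions: first phoneme when f < 0, second when f > 0.
n_correct <- function(v1, v2) sum(v1 < 0) + sum(v2 > 0)

# ASSUMPTION: classical Fisher ratio used as a stand-in tie-breaker.
fisher_score <- function(v1, v2) {
  (mean(v1) - mean(v2))^2 / (var(v1) + var(v2))
}

# Index of the best candidate: highest count first, score breaks ties.
pick_best <- function(counts, scores) order(-counts, -scores)[1]

# Toy values of two candidate functions on three samples per phoneme.
cand_a <- list(p1 = c(-1.0, -0.5, -0.2), p2 = c(0.3, 0.6, 0.9))
cand_b <- list(p1 = c(-3.0, -2.5, -2.8), p2 = c(2.9, 3.1, 3.0))

counts <- c(n_correct(cand_a$p1, cand_a$p2), n_correct(cand_b$p1, cand_b$p2))
scores <- c(fisher_score(cand_a$p1, cand_a$p2),
            fisher_score(cand_b$p1, cand_b$p2))
best <- pick_best(counts, scores)
```

Both toy candidates classify all six samples correctly, so the choice falls to
the tie-breaker, which prefers the candidate whose two groups of values are
farther apart relative to their spread.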

3 Families of classifiers 

By a spectral range we mean a sequence $R_{i,j} = (s_{i+1}, \ldots, s_j)$ of
consecutive spectral amplitudes (ordered by increasing frequency). Since brain
structures devoted to speech recognition are tonotopically organized [4], we
propose to use for discrimination functions defined on spectral ranges. Each
discrimination function takes the form

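$$f = f_1(R_{i,j}) - f_2(R_{k,l}),$$

where $f_1$, $f_2$ are symmetric, additively homogeneous functions of
KS-algebra. Because both functions shift by the same constant when a constant
is added to all spectral values, the difference of the values of $f_1$ and
$f_2$ is intensity invariant. We distinguish five different classes of such
functions: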

1. the mean of values in the spectral range 

2. the mean of m largest values in the spectral range 

3. the $A_1$ average of m largest values in the spectral range

4. the $A_2$ average of m largest values in the spectral range

5. a quantile of the spectral range (the m-th largest value) 
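
The five classes listed above can be sketched in R as functions of a spectral
range r (a numeric vector) and a count m. The function names, the toy spectrum
and the particular ranges are hypothetical; the sketch assumes
1 <= m <= length(r) and writes the $A_1$ and $A_2$ averages out from the
definition $A_a(x) = \ln M_a(\exp(x))$.

```r
# The m largest values of a spectral range r, in decreasing order.
top_m <- function(r, m) sort(r, decreasing = TRUE)[seq_len(m)]

fam1 <- function(r) mean(r)                             # 1. mean of the range
fam2 <- function(r, m) mean(top_m(r, m))                # 2. mean of m largest
fam3 <- function(r, m) log(mean(exp(top_m(r, m))))      # 3. A_1 of m largest
fam4 <- function(r, m) {                                # 4. A_2 of m largest
  log(sqrt(mean(exp(2 * top_m(r, m)))))
}
fam5 <- function(r, m) top_m(r, m)[m]                   # 5. m-th largest value

# A discrimination function f = f1(R) - f2(R') built from family 2,
# with hypothetical ranges s[3:8] and s[10:16].
f <- function(s) fam2(s[3:8], 2) - fam2(s[10:16], 3)

s <- c(0.1, 1.3, 2.2, 3.7, 1.9, 0.4, 2.8, 1.1,
       0.9, 3.1, 2.6, 0.2, 1.5, 2.0, 0.7, 1.8)   # toy spectrum

# Adding a constant to every spectral value leaves f unchanged.
invariant <- isTRUE(all.equal(f(s + 5), f(s)))
```

With m equal to the width of the range, fam2 coincides with fam1, and with
m = 1 all of fam2 to fam5 return the maximum of the range.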

Obviously, family 1 is a subset of family 2. Families 2-5 can be seen as
special cases of a family obtained by taking the average $A_a$ of the m
largest values ($a = 0, 1, 2, -\infty$ respectively). Families 1, 2 and 5 are
special cases of so-called OWA operators [5]. A general n-ary OWA operator $F$
with weight vector $w = (w_1, \ldots, w_n)$ is defined by the expression

$$F(a_1, \ldots, a_n) = w_1 b_1 + \cdots + w_n b_n,$$

where $b_i$ is the i-th largest element of the set $\{a_1, \ldots, a_n\}$.
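
The OWA definition above translates directly into R. The following is a
minimal sketch with hypothetical names (owa, w_mean, w_top_m, w_quantile) and
toy inputs; it shows weight vectors that reproduce families 1, 2 and 5.

```r
# OWA operator: weighted sum of the inputs sorted in decreasing order,
# F(a) = w_1 b_1 + ... + w_n b_n with b_i the i-th largest element of a.
owa <- function(a, w) {
  stopifnot(length(a) == length(w))
  sum(w * sort(a, decreasing = TRUE))
}

a <- c(2.0, 5.0, 1.0, 4.0)   # toy inputs
n <- length(a)
m <- 2

w_mean     <- rep(1 / n, n)                     # family 1: plain mean
w_top_m    <- c(rep(1 / m, m), rep(0, n - m))   # family 2: mean of m largest
w_quantile <- as.numeric(seq_len(n) == m)       # family 5: m-th largest value

r_mean     <- owa(a, w_mean)
r_top_m    <- owa(a, w_top_m)
r_quantile <- owa(a, w_quantile)
```

Families 3 and 4 are not weighted sums of the sorted values, which is why they
fall outside the OWA class.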

If one limits oneself to searching over pairs of functions defined on ranges
of width up to $w$, the complexity of selecting the best one is
$\sim k N w^6$ in families 2-5, and $\sim k' N w^4$ in family 1, where $N$ is
the number of samples in the training set. Despite using optimized software,
the former growth is still quite large, and we opt to search only through a
fixed number of values of m in families 2-5 that includes m = 1, m = w, and
m close to w/4, w/2 and 3w/4.
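
The restricted set of values of m can be generated as in the following R
sketch. The function name candidate_m is hypothetical, and the reading of
"close to" as rounding to the nearest integer, clamped to the interval from 1
to w, is an assumption of this sketch.

```r
# Candidate values of m for a range of width w: the endpoints 1 and w plus
# values near w/4, w/2 and 3w/4.
# ASSUMPTION: "close to" is implemented by round(), clamped to 1..w.
candidate_m <- function(w) {
  m <- c(1, round(c(w / 4, w / 2, 3 * w / 4)), w)
  sort(unique(pmin(pmax(m, 1), w)))
}

m_for_8  <- candidate_m(8)    # width 8
m_for_20 <- candidate_m(20)   # width 20
m_for_1  <- candidate_m(1)    # degenerate range of a single value
```

For narrow ranges several candidates coincide, so the set has at most five
elements and the search cost per pair of ranges no longer grows with m.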



4 Trainability 

Trainability of classifiers is their ability to capture class distributions
over training data. Significant failure to classify training data is an
indication that the classifier is not flexible enough. Training errors for
each class of classifiers are shown in Table 1.

Table 1: Total number of errors on training data of the best classifiers in the
family given in the column for a pair of phonemes given by the row

From the table we can see that family 1 of classifiers is the least trainable,
which can be expected, since it is subsumed by family 2. On the other hand,
family 2 is slightly more trainable than the others.

5 Performance on test data



The crucial characteristic of any classifier is its performance on the test
data. Table 2 summarizes the results of the best classifiers within a given
family (priority is given to c(f), and p(f) is used in case of ties). We can
again see that family 2 of classifiers provides the best testing performance.

Pair    |  1 |  2 |  3 |  4 |  5 | Family 2 pctg.
aa-ao   | 95 | 91 | 96 | 98 | 94 | 79.95%
aa-dcl  |    |    |    |    |    | 100%
aa-iy   |    |    |    |    |    | 100%
aa-sh   |  2 |  1 |    |    |    | 99.75%
ao-dcl  |    |    |  1 |    |    | 100%
ao-iy   |    |    |  1 |  2 |  1 | 100%
ao-sh   |  1 |    |    |    |    | 100%
dcl-iy  | 30 | 16 | 24 | 25 | 34 | 96.84%
dcl-sh  |  1 |  1 |    |    |    | 99.76%
iy-sh   |  5 |  1 |  2 |  2 |  1 | 99.81%

Table 2: Total number of errors on test data of the best classifiers in the
family given in the column for a pair of phonemes given by the row

Intriguing is the poor performance of family 5 on discrimination of dcl versus
iy. In fact, there is a simple classifier in family 5 with only 24 errors on
test data. Putting priority on the correct train count c(f) rather than on
p(f) resulted in reporting the performance of the poorer classifier
(dcl.iy.2.disc in the R code listing below) rather than the better one
(dcl.iy.1.disc).

# Better classifier: peak of bins 2..6 against the peak of bins 1..13.
dcl.iy.1.left.1 = function(x) max(x[2:6])
dcl.iy.1.right.1 = function(x) max(x[1:13])

dcl.iy.1 = function(x) dcl.iy.1.left.1(x) - dcl.iy.1.right.1(x)
dcl.iy.1.disc = function(x) if (dcl.iy.1(x) < 0) "iy" else "dcl"

# Reported classifier: peak of bins 1..4 against a quantile of bins 8..14.
dcl.iy.2.left.1 = function(x) max(x[1:4])
dcl.iy.2.right.1 = function(x) Q(x, 8, 14, 5)

dcl.iy.2 = function(x) dcl.iy.2.left.1(x) - dcl.iy.2.right.1(x)
dcl.iy.2.disc = function(x) if (dcl.iy.2(x) < 0) "iy" else "dcl"

# Quantile helper: sort() is ascending by default, so v[p] is the
# p-th smallest value among x[a:b].
Q = function(x, a, b, p) {
  v = sort(x[a:b])
  return(v[p])
}



6 Visualization 

The families of discriminators we have examined in this article can be readily
visualized.

Figure 1: Supports and position of parameter m (white circles) of optimal
classifiers of a given width in family 5. In the right picture, locations of
average frequencies of the formant F2 for ao (light blue) and iy (light green)
are indicated by vertical lines. The average values were taken from [6].



In Figure 1 we can see how the supports of $f_1$ and $f_2$ change as we allow
the size of the support to increase. In the right picture we can see that the
support of the components of $f$ matches well with formants.



7 Conclusion 

We have conducted a search for the structure of optimal Z-classifiers in 5
families of functions in KS-algebra. Among the families we considered,
slightly better results on both training and test data were obtained in
family 2. We have demonstrated that classifiers found by our procedure reflect
the well-known formant concept. Advantages of these classifiers include clear
interpretation, visualizations and lack of any continuously varied parameters,
resulting in low Kolmogorov complexity.

Further research should investigate more general classes of KS-algebra
based classifiers, namely B and A-classifiers, adjustments for psychoacoustic
phenomena, and develop means to compose single feature classifiers.